Exosome-mediated signal transport plays a variety of critical roles in cancer progression and
metastasis. From the standpoint of cancer diagnosis, circulating exosomes are ideal sources of biomarkers
because they carry the molecular features of tumor cells. However, isolating pure exosomes from
body fluids is time-consuming and remains a major challenge for comprehensive profiling of
exosomal proteins and miRNAs. Here we constructed anti-CD9 antibody-coupled, highly porous monolithic
silica microtips that allow automated, rapid, and reproducible exosome extraction from multiple clinical
samples. We applied these tips to explore lung cancer biomarker proteins on exosomes by analyzing 46
serum samples. Mass spectrometric quantification of 1,369 exosomal proteins identified CD91 as a lung
adenocarcinoma-specific antigen on exosomes, which was further validated with a CD9-CD91 exosome
sandwich ELISA measuring 212 samples. Our simple device can promote not only biomarker discovery
studies but also a wide range of omics research on exosomes.



Lung cancer is the leading cause of cancer-related mortality worldwide, accounting for 1,475,117 deaths in
2011 (Global Health Observatory Data Repository, World Health Organization). The high mortality is
mainly attributable to late-stage diagnosis and the lack of effective treatments. Indeed, with current
cancer screening tests, only 30% of patients are diagnosed at an early disease stage and present with surgically
resectable tumors. Therefore, the development of novel biomarkers and the establishment of a blood-based early
detection system for lung cancer are crucial for improving clinical outcomes and the overall survival rate.

Recently, the biological significance and clinical utility of exosomes have been extensively discussed. In particular,
the contribution of tumor-derived exosomes to the formation of metastatic microenvironments is one of their most
fundamental functions, and understanding it could clarify cancer metastasis and even suggest new
therapeutic strategies to prevent it. Exosome-mediated delivery of therapeutic RNAs is already
at a pioneering stage for cancer treatment. In the field of cancer diagnosis, exosomes are also attractive targets
for biomarker discovery because of their molecular characteristics. In principle, a set of molecules expressed in
the original solid tumor cells should be detectable as exosomal components in the blood circulation. Despite the
theoretical feasibility of exosomal biomarkers, difficulties in isolating exosomes from biological fluids have significantly
hindered the effective discovery of biomarker candidates. In fact, although ultracentrifugation-based methods are the
most common strategies for isolating exosomes from serum samples, their reproducibility, processing time, and
purity are not adequate for biomarker screening studies that quantitatively analyze large numbers of clinical samples.

In the present study, we established antibody-assisted exosome purification tips by immobilizing anti-CD9
antibody on Mass Spectrometric Immunoassay (MSIA) monolith pipette tips. This multi-channel platform
effectively streamlined proteome-wide mass spectrometric profiling of serum exosomes and allowed accurate statistical
identification of lung cancer-specific exosomal proteins. We further constructed an exosome sandwich ELISA for a
large-scale replication study to validate the screening reliability for an identified exosome surface antigen, CD91.

Results 

Isolation of serum exosomes by anti-CD9-MSIA tips. To perform reproducible and high-purity separation of
exosomes from serum, we employed antibody-immobilized, low back pressure monolithic tips on an automated
Figure 1 | Schematic view of the exosomal biomarker discovery workflow. (a) Magnified picture of anti-CD9 MSIA tips (left) and a dedicated holding
fixture (right). Pictures were taken by the authors. (b) Exosome fractions were purified from a pooled serum sample using 6 independent anti-CD9-MSIA
tips and analyzed by LC/MS/MS in triplicate measurements. The coefficient of variation (CV) of peak intensities corresponding to the CD9 155-170 peptide
(GLAGGVEQFISDICPK, m/z = 845.9266) or the CD81 149-171 peptide (TFHETLDCCGSSTLTALTTSVLK, m/z = 848.0733) is shown. (c) Exosomes were
isolated from 46 serum samples by anti-CD9 antibody-coupled monolith tips (anti-CD9-MSIA tips) on a 12-channel automatic pipetting platform. The
enriched exosome fractions were analyzed by LC/MS/MS and subjected to label-free quantification analysis by RefinerMS software on the Expressionist
proteome server system. The quantified peptides underwent 2-step statistical analysis, composed of ANOVA and a feature elimination method, and the finally
extracted biomarker candidate peptides were identified by Sequest database search. The identification threshold was set at a false discovery rate
(FDR) < 1%.



12-channel pipette system (Figure 1a), which allowed isolation of exosomes from 12 serum samples
simultaneously in 30 minutes. Here we selected the tetraspanin CD9 as the target of the
exosome-capturing antibody because of its strong expression on the surface of
exosomes secreted from diverse cell types. To evaluate the
reproducibility of the anti-CD9-MSIA tips, exosomes were purified from
a pooled serum sample using 6 independent tips and analyzed by
LC/MS/MS in triplicate measurements (Figure 1b). The coefficient of
variation (CV) of the peak area corresponding to the CD9 155-170 peptide
(GLAGGVEQFISDICPK, m/z = 845.9266) or the CD81 149-171 peptide
(TFHETLDCCGSSTLTALTTSVLK, m/z = 848.0733), CD81 being
another typical exosome marker molecule, was 2.49% or
2.87%, respectively, indicating that the error level in relative
quantification analysis was small enough for reliable biomarker
Table 1 | Clinical information of serum samples
identification. We next isolated serum exosomes from 10
normal controls (NC), 10 interstitial pneumonia patients (IP), 14
lung adenocarcinoma patients (ADC), and 12 lung squamous cell
carcinoma patients (SCC) using anti-CD9-MSIA tips. Purified
exosomes were individually analyzed by LC/MS/MS and
subjected to statistical analysis as shown in Figure 1c.
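
The reproducibility figure reported for the CD9 and CD81 peptides is a coefficient of variation, that is, the sample standard deviation divided by the mean. A minimal Python sketch of that calculation follows; the peak areas are hypothetical placeholders and not the study's data.

```python
from statistics import mean, stdev

def cv_percent(peak_areas):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(peak_areas) / mean(peak_areas)

# Hypothetical peak areas for one peptide measured across replicate tips.
hypothetical_areas = [100.0, 102.0, 98.0]
print(f"CV = {cv_percent(hypothetical_areas):.2f}%")  # CV = 2.00%
```

A CV of a few percent, as reported here, means the tip-to-tip and run-to-run scatter is small relative to the group differences a screening study hopes to detect.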

Proteome-wide overview of human serum exosomes. The LC/MS/MS
analysis of 46 serum samples (Table 1) and the subsequent Sequest
database search identified 1,369 non-redundant proteins (FDR <
1%, Supplementary Table 1). To assess the purity of the anti-CD9-MSIA
tip eluates, the identified proteins were classified according to
subcellular localization (Figure 2a). The Cellular Component
distribution from DAVID GO analysis showed strong enrichment of
intracellular proteins (701 proteins, 51.2%) and plasma membrane
proteins (290 proteins, 21.2%), whereas only 135 extracellular (secreted) proteins
(9.8%) were identified. These values indicate efficient
enrichment of exosomes carrying components of their cells of
origin. Importantly, most abundant serum proteins such
as albumin and IgG, which often hinder the sensitive detection of
minor exosomal proteins, were effectively washed out during the MSIA
purification steps. Moreover, to elucidate the physiological
Figure 2 | Proteome-wide overview of 1,369 identified exosomal proteins. (a) Distribution of protein subcellular localization shown in a pie chart.
(b) Fisher's exact statistics in the DAVID system were used for functional annotation clustering analysis. The 10 enriched functions detected in the 1,369
exosomal proteins are shown with Expression Analysis Systematic Explorer (EASE) scores.
Figure 3 | Two-step statistical selection of biomarker candidates. For the first stage, 3-group ANOVA was performed to compare NC, IP, and ADC
groups (a) or NC, IP, and SCC groups (b). Peptides satisfying the criterion p < 0.001 were used for the second Ranking selection. To calculate the
minimum set of biomarkers which could provide the minimum misclassification rate, cross validation-based support vector machine-recursive feature
elimination (SVM-RFE) method was used for comparison of NC, IP, and ADC groups (c). Similarly, SVM-SVM method was employed for comparison of
NC, IP, and SCC groups (d). The number of selected biomarker candidates and the misclassification rate were shown. NC, normal control; IP, interstitial
pneumonia; ADC, adenocarcinoma; SCC, squamous cell carcinoma.



functions of serum exosomes, Expression Analysis Systematic
Explorer (EASE) scores were calculated (Figure 2b). This
functional estimation suggested a possible association of serum
exosomes with immune regulation, cell-to-cell interactions, and
stimulation responses, in addition to vesicle transport. These data
should contribute to new insights into the biological functions of
not only tumor-derived exosomes but also normal exosomes.
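
The EASE score used for the functional enrichment can be illustrated with a short sketch. It assumes the definition commonly given in the DAVID documentation: a one-sided Fisher exact test in which the number of list hits is reduced by one, which makes the score more conservative than the plain Fisher p-value. The counts below are illustrative and are not taken from this study.

```python
from math import comb

def hypergeom_upper_tail(k_min, pop_total, pop_hits, list_total):
    """P(X >= k_min) when drawing list_total items from a population of
    pop_total that contains pop_hits annotated items."""
    upper = min(pop_hits, list_total)
    numerator = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(max(k_min, 0), upper + 1)
    )
    return numerator / comb(pop_total, list_total)

def fisher_p(list_hits, list_total, pop_hits, pop_total):
    """One-sided Fisher exact p-value for over-representation."""
    return hypergeom_upper_tail(list_hits, pop_total, pop_hits, list_total)

def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Assumed EASE definition: remove one hit from the list, then run
    the same one-sided test on the reduced 2x2 table."""
    if list_hits < 1:
        return 1.0
    return hypergeom_upper_tail(
        list_hits - 1, pop_total - 1, pop_hits - 1, list_total - 1
    )

# Illustrative counts: 3 of 300 listed proteins carry a term that
# annotates 40 of 30,000 background proteins.
print(fisher_p(3, 300, 40, 30000), ease_score(3, 300, 40, 30000))
```

Because one hit is removed before testing, categories supported by very few proteins are penalized, which is the stated purpose of the modification.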

Statistical identification of exosomal biomarkers for lung cancer.
The label-free quantification analysis on the Expressionist RefinerMS
module (Figure 1c and Supplementary Figure 1) quantified 113,582
non-redundant peptides from 46 serum samples. In the first
statistical selection, 3-group ANOVA was used to roughly extract
signature peptides specific to ADC patients (230 peptides, p <
0.001) or SCC patients (316 peptides, p < 0.001), as shown in
Figure 3a or 3b, respectively. Here, to identify lung cancer-specific
exosomal biomarkers that do not respond to inflammatory lung
diseases such as interstitial pneumonia, IP patients were also
included as a non-cancer control group. For the second stage, a
cross validation-based feature elimination method was employed to
compute the minimum biomarker sets that provided the lowest
misclassification rates. The support vector machine-recursive
feature elimination (SVM-RFE) method and the SVM-SVM method
defined 181 and 32 peptides as the final candidate biomarker sets,
giving true prediction rates of 90.9% and 100% for the ADC and
SCC patient groups, respectively (Figure 3c and 3d). Cross-referencing
with the protein identification data identified 20 of these peptides, derived from 18 proteins
(Figure 4 and Table 2). Among them, CD91, integrin
alpha-IIb, and CD317 were selected as favorable exosomal
biomarker candidates because they were expected to localize on
the surface of exosomes and could therefore be measured by an exosome
sandwich ELISA system (Figure 5a).
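
The first selection stage rests on a one-way ANOVA across three clinical groups. The sketch below computes only the F statistic for a single peptide, using invented intensities; converting F to a p-value requires the F distribution, which a statistics library (for example `scipy.stats.f_oneway`) provides. It is an illustration in Python, not the Expressionist implementation.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical intensities of one peptide in NC, IP, and ADC samples.
nc, ip, adc = [1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]
print(one_way_anova_f(nc, ip, adc))
```

Applied peptide by peptide, a threshold such as p < 0.001 then keeps only peptides whose between-group variation is large relative to the within-group scatter.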

Validation experiment for CD91 by exosome sandwich ELISA. To
assess the quantitative reproducibility of the label-free quantification
results in our single-run screening analysis, as well as the clinical
usefulness of the candidate biomarkers, we conducted a further
validation study by exosome sandwich ELISA using 212 independent
serum samples (Table 1). Among the three candidate biomarker
proteins, we were able to obtain a suitable antibody and
construct an ELISA only for CD91. In this assay, we used
anti-CD9 antibody as the exosome-capture antibody and
Figure 4 | The 18 identified exosomal biomarker candidates. The LC/MS/MS signal intensities for the 18 biomarker candidates acquired from 46 cases are
displayed in box plots. The UniProtKB entry protein names, accompanied by amino acid numbers, are shown above the box plots. N, normal control; IP,
interstitial pneumonia; ADC, adenocarcinoma; SCC, squamous cell carcinoma.



biotinylated anti-CD9 or biotinylated anti-CD91 antibody as the
detection antibody (Figure 5a). Since serum exosome concentrations,
measured by CD9-CD9 sandwich ELISA, showed drastic
individual variability (Figure 5b), the measurements of the CD9-CD91
sandwich ELISA were normalized by exosome concentration
(denoted by U/exosome in Figure 5c). We also tested the classical
clinical biomarker CEA in the same sample set (Figure 5d) and
compared its diagnostic efficacy with that of exosomal CD91. When
we set the cut-off value at 2.04 U/exosome for exosomal CD91 and
5.0 ng/ml for CEA, exosomal CD91 showed significantly higher
sensitivity for detecting stage-I, II ADC patients (54.5%) than
CEA (22.7%), while the detection power of exosomal CD91 in
stage-III, IV ADC patients (61.4%) was similar to that of CEA
(66.3%). The false positive rate of exosomal CD91 in the control
group (NC and IP, n = 73) was 11.0%, while that of CEA was
8.2%. These results indicated that exosomal CD91 has better
potential for the early detection of lung ADC than CEA.
On the other hand, exosomal CD91 might not be useful for the
detection of SCC patients. In addition, we constructed a logistic
regression model to evaluate the advantage of combining
exosomal CD91 and CEA. The ROC curve analysis using control
samples (NC and IP, n = 73) and all ADC samples (n = 105) in
Figure 6a-c revealed that the combination biomarker
improved sensitivity (71.4%), specificity (91.8%), and the area
under the curve (0.882) compared with each single biomarker.



Additionally, the effects of age and gender on exosomal CD91
concentrations were assessed. The results indicated that exosomal CD91 was
affected by neither age (R² = 0.0536, Supplementary Figure 2a) nor
gender (p = 0.299, Supplementary Figure 2b). These results
suggested that exosomal CD91 is an independent predictor of
lung ADC that could complement the diagnostic power of CEA.
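
The sensitivity and false positive rate quoted above follow from applying a fixed cut-off to each group. The sketch below shows that bookkeeping in Python under two assumptions: a sample is called positive when its value is strictly greater than the cut-off, and all measurement values are invented for illustration. Only the 2.04 U/exosome cut-off comes from the text.

```python
CD91_CUTOFF = 2.04  # U/exosome, the cut-off stated in the text

def positive_rate(values, cutoff):
    """Fraction of samples called positive (value > cutoff, an assumption)."""
    return sum(v > cutoff for v in values) / len(values)

# Hypothetical normalized exosomal CD91 values, not the study's data.
hypothetical_cases = [1.0, 2.5, 3.0, 1.5]
hypothetical_controls = [0.8, 1.2, 2.2, 1.9, 1.1]

sensitivity = positive_rate(hypothetical_cases, CD91_CUTOFF)
false_positive_rate = positive_rate(hypothetical_controls, CD91_CUTOFF)
specificity = 1.0 - false_positive_rate
print(sensitivity, false_positive_rate, specificity)
```

Sensitivity is the positive rate among patients, the false positive rate is the positive rate among controls, and specificity is its complement.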

Discussion 

In recent years, the physiological roles of secreted microvesicles,
including exosomes, have been actively studied, mainly using cell culture
supernatants (CCS). However, studies on body fluid exosomes are
still at the stage of seeking appropriate technologies that allow
high-purity isolation of exosomes. In fact, ultracentrifugal
sedimentation methods, the most common techniques for purifying
exosomes, cannot remove large proteins (e.g. α2-macroglobulin,
IgM, and complement factors), aggregated proteins, circulating
proteasomes, and vault ribonucleoprotein particles in serum
because they have diameters (40-100 nm) and sedimentation
coefficients similar to those of exosomes. Owing to these major contaminants,
serum-derived proteins often account for a large part of the proteins
identified by LC/MS/MS analysis of ultracentrifugal sedimentation
samples (data not shown). In addition to purity, throughput and
reproducibility are also critical factors in the development of
exosome isolation methods because biomarker screening studies and
therapeutic target discovery studies usually deal with large numbers of clinical
specimens. To satisfy these technological requirements, we
established a rapid, reproducible, and high-quality isolation device by
integrating an exosome capture antibody, low-pressure monolith tips,
and a 12-channel automatic pipette. Since the MSIA tips are also compatible
with commercial 96-well automatic pipetting workstations, our
method is applicable to larger sample sets (> 100 cases). In
the present study, we used anti-CD9 antibody to capture exosomes in
both the MSIA tip devices and the exosome sandwich ELISA system.
Although CD9 is one of the best-known exosome markers,
its expression level might vary by tissue or disease
state. This indicates that using anti-CD9 antibody alone
could provide proteome profiles for only a limited proportion of
serum exosomes. Therefore, an effective combination of antibodies
specific to various exosome markers (e.g. CD63 or CD81) would
improve the comprehensiveness of exosome profiling studies.

From the proteome-wide biomarker screening of 46 serum samples,
we eventually identified the 18 biomarker candidate proteins shown in
Figure 4 and Table 2. With respect to their subcellular localization,
6 proteins are expressed in cellular compartments (BAIP2 in the peripheral
membrane area; copine-1 on the vesicle membrane; myeloperoxidase in the
lysosome; CD91, integrin alpha-IIb, and CD317 on the plasma membrane);
however, the other 13 proteins are known as major serum proteins.
Thus, although the number of identified extracellular proteins was small,
as shown in Figure 2a, the purification efficiency of the anti-CD9-MSIA
tip was still considered incomplete. To further improve
this device, we are now optimizing the coating agents for the monolith
polymer in the tips to minimize non-specific binding of serum
proteins. Because exosome-derived proteins are present at
ultratrace levels compared with major serum proteins, improving
exosome purification efficiency could significantly increase the
chances of detecting minor exosomal antigens associated with cancer
development or progression. A more effective and inexpensive exosome
isolation method could also promote the development of mass
spectrometric multiplexed biomarker diagnosis, measuring two or more
exosomal biomarkers simultaneously by MRM/SRM assay.

In this report, we showed that CD91 expression was significantly
elevated on exosomes, especially in the sera of lung ADC patients. The
detection power for the limited number of early-stage patients (n = 22)
was also higher than that of the existing biomarker CEA (Figure 5c and
5d), although at present we can conclude only that the exosomal CD91 assay can
detect at least a large cancer burden. In functional terms,
CD91 is a type-1 transmembrane receptor that
mediates ligand endocytosis in clathrin-coated pits and cargo trafficking
to lysosomes. Another fundamental role reported for CD91 is as a
signaling receptor regulating cytokine secretion, phagocytosis, and
migration of cells in the immune system. Notably, with regard
to cancer, no or very low expression of CD91 in lung cancer cells was
observed in tissues from ADC patients with poor clinical outcome,
while strong staining patterns of CD91 were observed in stromal cells
surrounding cancer cells from 94/111 ADC patients. Combined
with our results, this suggests that the high level of serum CD91-expressing exosomes
originates from stromal cells surrounding lung
cancer cells. Since construction of tumor microenvironments is
one of the most well-established functions of exosomes, inhibiting
the production or function of CD91-positive exosomes might lead to
suppression of lung cancer progression. Therefore, inhibitors of
CD91 that interfere with ligand binding, such as the
receptor-associated protein (RAP), suramin, α2-macroglobulin, and lactoferrin,
would be useful for investigating the association between exosomal CD91
and lung cancer. In contrast to CD91, another exosomal
biomarker candidate, CD317, has already been confirmed as a potential therapeutic
target for lung cancer. Owing to its lung cancer-specific
expression, chimeric or humanized antibody drugs have been developed
and tested in clinical trials. This evidence suggests that
lung ADC cells directly release high levels of CD317-positive exosomes into the
blood circulation.

Thus, in addition to identifying biomarker candidates, our
anti-CD9 MSIA tips can provide diverse new knowledge about
human exosomes. More comprehensive information will also be
obtained by changing the target of the immobilized antibody to
other exosomal surface antigens such as CD63, CD81,
tetraspanin-9, or tetraspanin-14.

Methods 

Serum samples. Serum samples from lung cancer patients (n = 165), interstitial
pneumonia patients (n = 29), and normal controls (n = 64) were collected at
Hiroshima University Hospital within the same period. All samples were collected
Figure 5 | Exosome sandwich ELISA-based validation experiment for CD91. (a) Principle of exosome sandwich ELISA. SA-HRP,
streptavidin-horseradish peroxidase. Using 212 independent serum samples, exosomal CD91 and CEA concentrations were measured. (b) Serum exosome
concentrations were determined by CD9-CD9 sandwich ELISA. (c) CEA concentrations were measured by commercial ELISA kits. (d) Exosomal CD91
concentrations were determined by CD9-CD91 sandwich ELISA. The values were normalized with exosome concentrations calculated in (b). Red lines
indicate the cut-off values for CEA at 5.0 ng/ml (c) or exosomal CD91 at 2.04 U/exosome (d). The sensitivity (Sens.) for each lung cancer sub-group and
specificity (Spec.) are shown below the box plots. N, normal control; IP, interstitial pneumonia; ADC_1_2, stage-I, II adenocarcinoma; ADC_3_4,
stage-III, IV adenocarcinoma; SCC_1_2, stage-I, II squamous cell carcinoma; SCC_3_4, stage-III, IV squamous cell carcinoma.



from untreated patients at the initial visit to the hospital. The nonanticoagulated blood
samples were collected and allowed to clot at room temperature for 1-2 hours. Sera
were then separated by centrifugation at 1500 rpm for 15 min and stored frozen at
-80°C. Written informed consent was obtained from all participants. This study
was approved by The Ethical Committee of RIKEN (Approval code: Yokohama H20-12)
and The Ethical Committee of Hiroshima University Hospital. All experiments
were performed in accordance with relevant guidelines and regulations.

Exosome purification by anti-CD9 MSIA tips. All procedures were performed on
Novus-i 12-channel electronic pipettes and an adjustable pipette stand (Thermo Fisher
Scientific, Waltham, Massachusetts, USA). The MSIA D.A.R.T.'s, Protein G tips
(Thermo Fisher Scientific) were equilibrated with 300 µl PBS × 10 cycles prior to
25 µl × 100 cycles of immobilization of 1 µg anti-CD9 antibody (provided by
Shionogi & Co., Ltd., Osaka, Japan) in 50 µl PBS. After crosslinking with 0.25 mM
BS3 (Thermo Fisher Scientific) in PBS (100 µl × 100 cycles), the reaction was quenched
with 50 mM ethanolamine-HCl (pH 8.0) (100 µl × 100 cycles). Following 300 µl ×
10 cycles of equilibration in PBS, the anti-CD9-MSIA tips were incubated with 350 µl
of seven-fold diluted serum samples by 300 µl × 100 cycles of pipetting. Tips were
washed three times by 300 µl × 25 cycles of pipetting in PBS. Finally, captured exosomes
were eluted by 20 µl × 100 cycles of pipetting in 30 µl of 8 M urea, 50 mM ammonium
bicarbonate solution.

After reduction with 5 mM TCEP at 37°C for 30 minutes and alkylation with
25 mM iodoacetamide at room temperature for 45 minutes, samples were diluted 7
times with 50 mM ammonium bicarbonate. Proteins were digested by Immobilized
Trypsin beads (Thermo Fisher Scientific) in a 96-well filter plate with continuous
shaking at 37°C for 6 hours. Tryptic digests were desalted by Oasis HLB 96-well
µElution Plate (Waters Corporation, Milford, Massachusetts, USA) and subjected to
LC/MS/MS analysis.

LC/MS/MS analysis. The dried peptide samples were resuspended in 2% acetonitrile
with 0.1% trifluoroacetic acid and analyzed by an LTQ-Orbitrap-Velos mass
spectrometer (Thermo Fisher Scientific) combined with an UltiMate 3000 RSLC
nano-flow HPLC system (DIONEX Corporation, Sunnyvale, California, USA). Samples
were separated on a 75 µm × 150 mm C18 tip-column (Nikkyo Technos, Tokyo,
Japan) using solvent A [0.1% formic acid] and solvent B [0.1% formic acid in
acetonitrile] with a multistep linear gradient of solvent B from 6.4 to 30% for 95 minutes and
30 to 95% for 10 minutes at a flow rate of 250 nl/min. The eluted peptides were ionized
with a spray voltage of 2000 V, and MS data were acquired by a data-dependent
fragmentation method in which the survey scan was acquired between m/z 400 and
1600 at a resolution of 60,000 with an automatic gain control (AGC) target value of 1.0 ×
10^6 ion counts. The top 20 most intense precursor ions in each survey scan were subjected
to low-resolution MS/MS acquisitions using normal CID scan mode with an AGC target
value of 5,000 ion counts in the linear ion trap.

The protein identification analysis was performed by SEQUEST database search on
Proteome Discoverer 1.3 software (Thermo Fisher Scientific). The MS/MS spectra
were searched against the human protein database SwissProt 2013_03 (20,255 sequences)
Figure 6 | ROC curve analysis for exosomal CD91 and CEA. ROC curves for CEA (a), exosomal CD91 (b), and logistic regression-based combination
marker CEA + exosomal CD91 (c) were depicted by R. The diagnostic efficiencies between NC + IP (n = 73) and lung ADC patients (n = 105) were
evaluated. The cut-off value was set at the point whose distance from (sensitivity, specificity) = (1, 1) reached the minimum. The sensitivity (Sens),
specificity (Spec), positive predictive value (PV+), negative predictive value (PV-), and area under the curve (AUC) were shown on each graph.



using search parameters as follows: Enzyme Name = Semitrypsin, Precursor Mass
Tolerance = 10 ppm, Fragment Mass Tolerance = 0.8 Da, Dynamic Modification =
Oxidation (Met), and Static Modification = Carbamidomethyl (Cys). A false discovery
rate of 0.01 and a Peptide Rank of 1 were set as peptide identification filters. The gene
ontology analysis was performed on EASE (Expression Analysis Systematic Explorer,
version 2.0) (http://www.geneontology.org/) to compute the overrepresented
functional categories in "Cellular Component" and "Molecular Function".
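
To make the precursor mass tolerance and the static carbamidomethyl modification concrete, the following Python sketch computes theoretical m/z values for the two marker peptides quoted in the Results and compares them with the reported values. The charge states (2+ for the CD9 peptide, 3+ for the CD81 peptide) are assumptions inferred from the reported m/z values; the monoisotopic masses are standard reference values.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565
PROTON = 1.007276
CARBAMIDOMETHYL = 57.021464  # static modification on every Cys

def peptide_mz(sequence, charge):
    """Theoretical m/z of a peptide with carbamidomethylated cysteines."""
    mass = WATER + sum(MONO[aa] for aa in sequence)
    mass += CARBAMIDOMETHYL * sequence.count("C")
    return (mass + charge * PROTON) / charge

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

# Reported m/z values from the text; charge states are assumed.
for seq, z, reported in [
    ("GLAGGVEQFISDICPK", 2, 845.9266),
    ("TFHETLDCCGSSTLTALTTSVLK", 3, 848.0733),
]:
    theo = peptide_mz(seq, z)
    print(seq, round(theo, 4), round(ppm_error(reported, theo), 2), "ppm")
```

Under these assumptions both reported values fall inside the 10 ppm precursor window used for the database search.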

Label-free quantification analysis on Expressionist server. The LC/MS/MS data set
was loaded into the Expressionist RefinerMS module (Genedata AG, Basel, Switzerland) for
data processing and label-free quantification analysis. The whole workflow of the
RefinerMS software is shown in Supplementary Figure S1. The Spectrum Grid was
set at every 10 data points on 2D MS chromatogram planes (x = m/z and y = RT).
The Structure Removal was sequentially performed with RT = 2 scans and m/z = 6
points in the first and second Chemical Noise Subtraction, respectively, followed by
the third Subtraction using RT Window = 500 scans and Quantile = 90%. After the fourth
Subtraction with RT = 2 scans, signals with intensity < 2,000 were clipped off by
Intensity Thresholding. Finally, further Structure Removals with RT = 2 scans and
m/z = 5 points were run as the fifth and sixth Chemical Noise Subtractions,
respectively.

Then the Chromatogram Grid was set at every 10 scans on noise-subtracted data,
followed by Chromatogram RT Alignment using the parameters: m/z Window = 11
points, RT Window = 11 scans, Gap Penalty = 1, RT Search Interval = 2 minutes,
and Alignment Scheme = pairwise alignment based tree. Next, the Summed Peak
Detection Activity detected the peaks on a temporally averaged chromatogram with
parameters as follows: Summation Window = 20 scans, Overlap = 10, Minimum
Peak Size = 6 scans, Maximum Merge Distance = 1 data point, Gap/Peak Ratio = 5,
Method = curvature-based peak detection, Peak Refinement Threshold = 5, and
Consistency Filter Threshold = 1. Finally, the Summed Isotope Clustering Activity
grouped isotopic peaks derived from a single molecule into an isotope cluster. Here
parameters were used as follows: Minimum Charge = 1, Maximum Charge = 6,
Maximum Missing Peaks = 0, First Allowed Gap Position = 10, Ionization =
protonation, RT Tolerance = 0.1 minute, m/z Tolerance = 0.01 Da, and Minimum
Cluster Size Ratio = 0.5.

Statistical analysis on Expressionist Analyst. Two-step statistical selection was
employed for the effective identification of a biomarker set. In the first stage, 3-group
ANOVA was performed to roughly extract the candidates showing significantly
distinct expression levels among three clinical groups (p < 0.001). Next, the minimum
combination of biomarkers that provided the best classification rate was estimated
by the Support Vector Machine-Recursive Feature Elimination (SVM-RFE) Ranking
method in the Expressionist Analyst module (Genedata AG). SVM-RFE is an iterative
algorithm that works backward from an initial set of statistical features. As a classifier,
SVM was used for inferring decision rules. RFE was used as a ranking method that
iteratively dropped the 10% of peptides with the lowest weights in each step. Finally, a
Line Plot visualizer was returned displaying the average misclassification rate. Each
classifier was represented by a line, and the optimal classifier and biomarker peptide
set size for the chosen ranking method could be read off.
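
The elimination loop described above can be sketched in plain Python. This is an illustration under stated assumptions and not the Expressionist Analyst implementation: the classifier is a two-class linear SVM without a bias term, trained by batch subgradient descent on the hinge loss; the cross-validation step that produces the misclassification curve is omitted; and the data are synthetic, with feature 0 constructed to be the only informative one. The names `train_linear_svm` and `svm_rfe` are hypothetical.

```python
import math

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Batch subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:  # only margin violators contribute
                for j, xj in enumerate(xi):
                    grad[j] -= yi * xj / len(X)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def svm_rfe(X, y, n_keep, drop_percent=10):
    """Repeatedly drop the lowest-|weight| 10% of features (at least one)
    until n_keep remain. Returns surviving indices and set sizes."""
    active = list(range(len(X[0])))
    sizes = [len(active)]
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_sub, y)
        n_drop = max(1, len(active) * drop_percent // 100)
        n_drop = min(n_drop, len(active) - n_keep)
        order = sorted(range(len(active)), key=lambda k: abs(w[k]))
        dropped = set(order[:n_drop])
        active = [f for k, f in enumerate(active) if k not in dropped]
        sizes.append(len(active))
    return active, sizes

# Synthetic data: 20 samples, 20 features; feature 0 separates the classes,
# the remaining features are small deterministic noise.
labels = [1 if i % 2 == 0 else -1 for i in range(20)]
X = [[2.0 * y] + [0.1 * math.sin(i * j + j) for j in range(1, 20)]
     for i, y in enumerate(labels)]
selected, sizes = svm_rfe(X, labels, n_keep=3)
print(selected, sizes)
```

In a full analysis each intermediate feature set would be scored by cross-validated misclassification rate, and the smallest set at the minimum of that curve would be reported.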

Exosome sandwich ELISA assay. Anti-CD9 antibody (250 ng/well) was
immobilized on a Nunc MaxiSorp flat-bottom 96-well plate (Thermo Fisher Scientific).
The blocking solution (150 µl/well of 5% BSA in PBS) was then added and incubated
on the plate shaker at ambient temperature for 60 minutes. After three washes with
PBS, [5 µl serum + 95 µl PBS] or [30 µl serum + 70 µl PBS] was loaded into the upper
48 wells or the lower 48 wells, respectively. Following 5 hours of incubation, plates were
washed three times with PBS. Then 100 µl/well of biotinylated anti-CD9 antibody
(125 ng/ml) or biotinylated anti-CD91 antibody (500 ng/ml, Abcam, Cambridge,
UK) in 1% BSA was loaded into the upper 48 wells or the lower 48 wells, respectively.
After 60 minutes of incubation, plates were washed three times with PBS and then
covered with 100 µl/well of 1× HRP-Streptavidin (Abcam) in 1% BSA solution. After
30 minutes of incubation, plates were washed three times with PBS and covered with
100 µl/well of TMB Ready Solution (Thermo Fisher Scientific). The reaction was
stopped after 10 minutes of incubation using 100 µl/well of 2N HCl. The OD at 450 nm
was immediately measured. The concentration data from the CD9-CD9 and CD9-CD91
ELISA assays were normalized with gradient curves (5, 10, 15, 20, and 25 µl) obtained
from a common pleural effusion sample. Finally, CD91 concentrations were
normalized with the exosome concentration calculated with the CD9-CD9 ELISA above.
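
The two normalization steps can be sketched as follows. The sketch assumes a linear relationship between OD and the loaded volume of the common reference sample, which the text does not state, and uses invented OD readings; one "unit" here means the signal of 1 µl of the reference sample. The function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def od_to_units(od, slope, intercept):
    """Convert an OD reading to reference-sample units via the fitted line."""
    return (od - intercept) / slope

# Gradient curve volumes from the text; OD readings are hypothetical.
volumes_ul = [5, 10, 15, 20, 25]
reference_od = [0.55, 1.05, 1.55, 2.05, 2.55]
slope, intercept = fit_line(volumes_ul, reference_od)

# Hypothetical sample: CD9-CD91 signal divided by CD9-CD9 signal.
cd91_units = od_to_units(1.05, slope, intercept)
exosome_units = od_to_units(0.55, slope, intercept)
print(cd91_units / exosome_units)
```

In practice each assay (CD9-CD9 and CD9-CD91) would have its own gradient curve; a single curve is used here only to keep the example short.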

Box plot analysis and ROC curve analysis. The intensities of mass spectrum peaks
corresponding to candidate biomarker peptides were displayed as box plots using R.
For each study, the box represents the middle half of the distribution of the
data points, stretching from the 25th percentile to the 75th percentile. The line across
the box represents the median. The lengths of the lines above and below the box are
defined by the maximum and minimum data point values, respectively, that lie within
1.5 times the spread of the box. ROC curves were also depicted by R. The cut-off value
was set at the point whose distance from (sensitivity, specificity) = (1, 1) reached
the minimum. The sensitivity (Sens), specificity (Spec), positive predictive value
(PV+), negative predictive value (PV-), and area under the curve (AUC) were shown
on each graph.
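
The cut-off rule stated above, choosing the ROC point closest to (sensitivity, specificity) = (1, 1), can be written out in a few lines. The study used R; the Python sketch below is an equivalent illustration with invented values, assuming a sample is called positive when its value is at or above the candidate cut-off.

```python
from math import hypot

def best_cutoff(controls, cases):
    """Return (cutoff, sensitivity, specificity) for the candidate cut-off
    whose ROC point lies closest to (1, 1)."""
    best = None
    for cutoff in sorted(set(controls) | set(cases)):
        sens = sum(v >= cutoff for v in cases) / len(cases)
        spec = sum(v < cutoff for v in controls) / len(controls)
        dist = hypot(1.0 - sens, 1.0 - spec)
        if best is None or dist < best[0]:
            best = (dist, cutoff, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical marker values, not the study's data.
hypothetical_controls = [1, 2, 3, 4]
hypothetical_cases = [3, 5, 6, 7]
print(best_cutoff(hypothetical_controls, hypothetical_cases))  # (5, 0.75, 1.0)
```

This criterion weights sensitivity and specificity equally, which is why the chosen threshold can differ from a fixed clinical cut-off such as 5.0 ng/ml for CEA.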